AITopics | noun compound


noun compound


Probing BERT for German Compound Semantics

Miletić, Filip, Schmid, Aaron, Walde, Sabine Schulte im

arXiv.org Artificial Intelligence, May-21-2025

This paper investigates the extent to which pretrained German BERT encodes knowledge of noun compound semantics. We comprehensively vary combinations of target tokens, layers, and cased vs. uncased models, and evaluate them by predicting the compositionality of 868 gold standard compounds. Looking at representational patterns within the transformer architecture, we observe trends comparable to equivalent prior work on English, with compositionality information most easily recoverable in the early layers. However, our strongest results clearly lag behind those reported for English, suggesting an inherently more difficult task in German. This may be due to the higher productivity of compounding in German than in English and the associated increase in constituent-level ambiguity, including in our target compound set.
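A common way to score compositionality from embeddings is to compare the compound's vector with a constituent's vector and then rank-correlate the predictions with gold ratings. The sketch below illustrates that general idea only; it is not taken from the paper. The vectors, the ratings, and the choice of cosine similarity and Spearman correlation are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical toy data: (compound vector, head vector, gold rating).
# None of these numbers come from the paper or from a real model.
TOY = {
    "Apfelbaum": ([0.9, 0.1, 0.0], [1.0, 0.0, 0.0], 5.5),
    "Handschuh": ([0.5, 0.5, 0.0], [1.0, 0.0, 0.0], 3.0),
    "Löwenzahn": ([0.1, 0.9, 0.2], [1.0, 0.0, 0.0], 1.5),
}

predicted = [cosine(c, h) for c, h, _ in TOY.values()]
gold = [g for _, _, g in TOY.values()]
rho = spearman(predicted, gold)
print(f"Spearman rho on toy data: {rho:.2f}")
```

With a real model, the toy vectors would be replaced by representations read from a chosen layer and token span, which are the settings the abstract says were varied.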

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.1413

Country:

South America > Colombia > Meta Department > Villavicencio (0.05)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)


Annotating Compositionality Scores for Irish Noun Compounds is Hard Work

Walsh, Abigail, Clifford, Teresa, Daly, Emma, Dunne, Jane, Davis, Brian, Cleircín, Gearóid Ó

arXiv.org Artificial Intelligence, Feb-14-2025

Noun compounds constitute a challenging construction for NLP applications, given their variability in idiomaticity and interpretation. In this paper, we present an analysis of compound nouns identified in Irish text of varied domains by expert annotators, focusing on compositionality as a key feature, but also on domain specificity, as well as the familiarity and confidence of the annotator giving the ratings. Our findings and the discussion that ensued contribute towards a greater understanding of how these constructions appear in the Irish language, and how they might be treated separately from English noun compounds.

artificial intelligence, construction, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.10061

Country:

Europe > Spain (0.69)
North America > United States (0.68)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context

Mi, Maggie, Villavicencio, Aline, Moosavi, Nafise Sadat

arXiv.org Artificial Intelligence, Oct-21-2024

Human processing of idioms relies on understanding the contextual sentences in which idioms occur, as well as language-intrinsic features such as frequency and speaker-intrinsic factors like familiarity. While LLMs have shown high performance on idiomaticity detection tasks, this success may be attributed to reasoning shortcuts in existing datasets. To this end, we construct a novel, controlled contrastive dataset designed to test whether LLMs can effectively use context to disambiguate idiomatic meaning. Additionally, we explore how collocational frequency and sentence probability influence model performance. Our findings reveal that LLMs often fail to resolve idiomaticity when it is required to attend to the surrounding context, and that models perform better on sentences that have higher likelihood. The collocational frequency of expressions also impacts performance. We make our code and dataset publicly available.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.16069

Country:

South America > Colombia > Meta Department > Villavicencio (0.05)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)


Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models

Liu, Yang, Qin, Melissa Xiaohui, Li, Hongming, Huang, Chao

arXiv.org Artificial Intelligence, May-5-2024

We introduce LexBench, a comprehensive evaluation suite designed to test language models (LMs) on ten semantic phrase processing tasks. Unlike prior studies, it is the first work to propose a framework from the comparative perspective to model the general semantic phrase (i.e., lexical collocation) and three fine-grained semantic phrases, including idiomatic expression, noun compound, and verbal construction. Using LexBench, we assess the performance of 15 LMs across model architectures and parameter scales in classification, extraction, and interpretation tasks. Through the experiments, we first validate the scaling law and find that, as expected, larger models outperform smaller ones in most tasks. Second, we investigate further through the scaling semantic relation categorization and find that few-shot LMs still lag behind vanilla fine-tuned models in the task. Third, through human evaluation, we find that the performance of strong models is comparable to the human level regarding semantic phrase processing. Our benchmarking findings can serve future research aiming to improve the generic capability of LMs on semantic phrase comprehension. Our source code and data are available at https://github.com/jacklanda/LexBench

computational linguistic, expression, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2405.02861

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


Do Vision-Language Models Understand Compound Nouns?

Kumar, Sonal, Ghosh, Sreyan, Sakshi, S, Tyagi, Utkarsh, Manocha, Dinesh

arXiv.org Artificial Intelligence, Mar-30-2024

Open-vocabulary vision-language models (VLMs) like CLIP, trained using contrastive loss, have emerged as a promising new paradigm for text-to-image retrieval. However, do VLMs understand compound nouns (CNs) (e.g., lab coat) as well as they understand nouns (e.g., lab)? We curate Compun, a novel benchmark with 400 unique and commonly used CNs, to evaluate the effectiveness of VLMs in interpreting CNs. The Compun benchmark challenges a VLM for text-to-image retrieval where, given a text prompt with a CN, the task is to select the correct image that shows the CN among a pair of distractor images that show the constituent nouns that make up the CN. Next, we perform an in-depth analysis to highlight CLIP's limited understanding of certain types of CNs. Finally, we present an alternative framework that moves beyond hand-written templates for text prompts widely used by CLIP-like models. We employ a Large Language Model to generate multiple diverse captions that include the CN as an object in the scene described by the caption. Our proposed method improves CN understanding of CLIP by 8.25% on Compun. Code and benchmark are available at: https://github.com/sonalkum/Compun
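The retrieval setup described in this abstract (one correct image plus two distractors showing the constituent nouns) can be sketched as a highest-similarity choice followed by an accuracy count. This is a minimal illustration under stated assumptions: the similarity scores are invented, the second example compound is hypothetical, and `pick_image` and `retrieval_accuracy` are made-up helper names rather than part of the Compun code.

```python
def pick_image(scores):
    """Return the label of the candidate image with the highest similarity."""
    return max(scores, key=scores.get)

def retrieval_accuracy(examples):
    """Fraction of examples where the top-scoring image is the target."""
    correct = sum(1 for ex in examples if pick_image(ex["scores"]) == ex["target"])
    return correct / len(examples)

# Hypothetical text-to-image similarity scores, not real model output.
# Each example has the compound's image and one image per constituent noun.
examples = [
    {"target": "lab coat",
     "scores": {"lab coat": 0.31, "lab": 0.28, "coat": 0.35}},
    {"target": "paper towel",
     "scores": {"paper towel": 0.40, "paper": 0.22, "towel": 0.33}},
]

accuracy = retrieval_accuracy(examples)
print(f"accuracy on toy examples: {accuracy:.2f}")
```

In the first toy example the distractor showing only the head noun wins, which is the kind of error the benchmark is built to expose.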

caption, compound noun, noun, (14 more...)

arXiv.org Artificial Intelligence

2404.00419

Country:

Oceania > Australia (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Sports (1.00)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Human and Automatic Interpretation of Romanian Noun Compounds

Marinescu, Ioana, Fellbaum, Christiane

arXiv.org Artificial Intelligence, Mar-10-2024

Determining the intended, context-dependent meanings of noun compounds like shoe sale and fire sale remains a challenge for NLP. Previous work has relied on inventories of semantic relations that capture the different meanings between compound members. Focusing on Romanian compounds, whose morphosyntax differs from that of their English counterparts, we propose a new set of relations and test it with human annotators and a neural net classifier. Results show an alignment of the network's predictions and human judgments, even where the human agreement rate is low. Agreement tracks with the frequency of the selected relations, regardless of structural differences. However, the most frequently selected relation was none of the sixteen labeled semantic relations, indicating the need for a better relation inventory. Keywords: Romanian noun compounds, semantic roles, human annotations, automatic classification.

category, compound, relation, (16 more...)

arXiv.org Artificial Intelligence

2403.0636

Country:

Europe > Sweden > Uppsala County > Uppsala (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.90)


From chocolate bunny to chocolate crocodile: Do Language Models Understand Noun Compounds?

Coil, Jordan, Shwartz, Vered

arXiv.org Artificial Intelligence, May-24-2023

Noun compound interpretation is the task of expressing a noun compound (e.g. chocolate bunny) in a free-text paraphrase that makes the relationship between the constituent nouns explicit (e.g. bunny-shaped chocolate). We propose modifications to the data and evaluation setup of the standard task (Hendrickx et al., 2013), and show that GPT-3 solves it almost perfectly. We then investigate the task of noun compound conceptualization, i.e. paraphrasing a novel or rare noun compound. E.g., chocolate crocodile is a crocodile-shaped chocolate. This task requires creativity, commonsense, and the ability to generalize knowledge about similar concepts. While GPT-3's performance is not perfect, it is better than that of humans -- likely thanks to its access to vast amounts of knowledge, and because conceptual processing is effortful for people (Connell and Lynott, 2012). Finally, we estimate the extent to which GPT-3 is reasoning about the world vs. parroting its training data. We find that the outputs from GPT-3 often have significant overlap with a large web corpus, but that the parroting strategy is less beneficial for novel noun compounds.

computational linguistic, large language model, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2305.10568

Country:

North America > United States > Colorado > Denver County > Denver (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Oceania > Australia (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)


"Covid vaccine is against Covid but Oxford vaccine is made at Oxford!" Semantic Interpretation of Proper Noun Compounds

Kolluru, Keshav, Stanovsky, Gabriel, Mausam

arXiv.org Artificial Intelligence, Oct-24-2022

Proper noun compounds, e.g., "Covid vaccine", convey information in a succinct manner (a "Covid vaccine" is a "vaccine that immunizes against the Covid disease"). These are commonly used in short-form domains, such as news headlines, but are largely ignored in information-seeking applications. To address this limitation, we release a new manually annotated dataset, ProNCI, consisting of 22.5K proper noun compounds along with their free-form semantic interpretations. ProNCI is 60 times larger than prior noun compound datasets and also includes non-compositional examples, which have not been previously explored. We experiment with various neural models for automatically generating the semantic interpretations from proper noun compounds, ranging from few-shot prompting to supervised learning, with varying degrees of knowledge about the constituent nouns. We find that adding targeted knowledge, particularly about the common noun, results in performance gains of up to 2.8%. Finally, we integrate our model-generated interpretations with an existing Open IE system and observe a 7.5% increase in yield at a precision of 85%. The dataset and code are available at https://github.com/dair-iitd/pronci.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2210.13039

Country:

North America > United States > California (0.14)
Asia > Middle East > Iran (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.94)


Systematicity in GPT-3's Interpretation of Novel English Noun Compounds

Li, Siyan, Carlson, Riley, Potts, Christopher

arXiv.org Artificial Intelligence, Oct-17-2022

Levin et al. (2019) show experimentally that the interpretations of novel English noun compounds (e.g., stew skillet), while not fully compositional, are highly predictable based on whether the modifier and head refer to artifacts or natural kinds. Is the large language model GPT-3 governed by the same interpretive principles? To address this question, we first compare Levin et al.'s experimental data with GPT-3 generations, finding a high degree of similarity. However, this evidence is consistent with GPT-3 reasoning only about specific lexical items rather than the more abstract conceptual categories of Levin et al.'s theory. To probe more deeply, we construct prompts that require the relevant kind of conceptual reasoning. Here, we fail to find convincing evidence that GPT-3 is reasoning about more than just individual lexical items. These results highlight the importance of controlling for low-level distributional regularities when assessing whether a large language model latently encodes a deeper theory.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2210.09492

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Oceania > Australia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Government > Voting & Elections (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Improved statistical machine translation using monolingual paraphrases

Nakov, Preslav

arXiv.org Artificial Intelligence, Sep-25-2021

We propose a novel monolingual sentence paraphrasing method for augmenting the training data for statistical machine translation systems "for free" -- by creating it from data that is already available rather than having to create more aligned data. Starting with a syntactic tree, we recursively generate new sentence variants where noun compounds are paraphrased using suitable prepositions, and vice-versa -- preposition-containing noun phrases are turned into noun compounds. The evaluation shows an improvement equivalent to 33%-50% of that of doubling the amount of training data.
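The two rewriting directions mentioned in this abstract (noun compound to prepositional phrase, and back) can be illustrated on flat two-word compounds. This is only a rough sketch: the paper works recursively on syntactic trees and selects suitable prepositions, whereas the function names, the default preposition, and the small preposition list below are hypothetical, and inflection is ignored.

```python
# Hypothetical preposition inventory, for illustration only.
PREPOSITIONS = ("of", "for", "in", "from")

def compound_to_prepositional(compound, preposition="of"):
    """Rewrite 'modifier head' as 'head preposition modifier'."""
    modifier, head = compound.split()
    return f"{head} {preposition} {modifier}"

def prepositional_to_compound(phrase, prepositions=PREPOSITIONS):
    """Rewrite 'head preposition modifier' as 'modifier head'.

    Phrases that do not match the three-word pattern are returned unchanged.
    """
    words = phrase.split()
    if len(words) == 3 and words[1] in prepositions:
        head, _, modifier = words
        return f"{modifier} {head}"
    return phrase

print(compound_to_prepositional("oil price"))
print(prepositional_to_compound("price of oil"))
```

Each generated variant could then be paired with the original sentence's translation as additional training data, which is the "for free" augmentation the abstract refers to.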

lifting, phrase table, translation, (14 more...)

arXiv.org Artificial Intelligence

2109.15119

Country:

Europe > Ukraine > Kyiv Oblast > Chernobyl (0.04)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
Oceania > Australia (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
